What Is Claimed Is: 



1. A document image processing device, comprising: 

a predetermined pixel block extraction part that extracts a 
predetermined pixel block that appears commonly on at least some pages 
from an input document image; and 

an image correction part that corrects a location of the whole input 
document image so that a position of the predetermined pixel block 
extracted by the predetermined pixel block extraction part is coincident with 
a reference position or a position of a reference pixel block in the document 
image. 
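
As an illustration only, the correction recited in claim 1 can be read as a pure translation of the page. The following is a minimal sketch under that assumption; the page representation (a list of pixel rows, 0 meaning background) and the names `correction_offset` and `shift_page` are hypothetical and are not taken from the claims.

```python
def correction_offset(block_pos, reference_pos):
    """Return the (dx, dy) shift that moves block_pos onto reference_pos."""
    return (reference_pos[0] - block_pos[0], reference_pos[1] - block_pos[1])


def shift_page(page, dx, dy, background=0):
    """Translate every pixel of the page by (dx, dy), padding with background."""
    height, width = len(page), len(page[0])
    shifted = [[background] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            # Pixels shifted outside the page are dropped.
            if 0 <= nx < width and 0 <= ny < height:
                shifted[ny][nx] = page[y][x]
    return shifted
```

Here the extracted block position and the reference position are both given as (x, y) pairs, and the same offset is applied to every pixel of the page rather than to the block alone.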

2. The document image processing device according to claim 1, 
further comprising: 

a reference position designation part that causes a user to designate 
the reference position or the position of the reference pixel block in the 
document image, 

wherein the image correction part corrects the location of the whole 
input document image so that the position of the predetermined pixel block 
extracted by the predetermined pixel block extraction part is coincident with 
the reference position or the position of the reference pixel block in the 
document image designated by the reference position designation part. 

3. The document image processing device according to claim 1, 
further comprising: 

an image memory part that holds the input document image per page, 

wherein the predetermined pixel block extraction part analyzes a 
layout of the document image in plural pages to be processed stored in the 
image memory part, and if there is approximately the same pixel block at a 
same position in the document image of each page, the predetermined pixel 
block extraction part regards the pixel block as a predetermined pixel block 
and determines the reference position. 
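
One possible reading of the layout analysis in claim 3 is sketched below. It assumes each page has already been reduced to a list of (x, y, w, h) bounding boxes, treats "approximately the same" as agreement within a pixel tolerance, and takes the mean position of the matched boxes as the reference position. The tolerance, the averaging rule, and the name `find_common_block` are assumptions made for illustration.

```python
def find_common_block(pages, tolerance=5):
    """pages: one list of (x, y, w, h) boxes per page.

    Returns (reference_position, matched_boxes) for the first box of page 0
    that has an approximately equal box on every other page, or None.
    """
    if not pages:
        return None
    for candidate in pages[0]:
        matches = [candidate]
        for boxes in pages[1:]:
            match = next(
                (b for b in boxes
                 if all(abs(p - q) <= tolerance for p, q in zip(b, candidate))),
                None,
            )
            if match is None:
                break  # this candidate is missing on some page
            matches.append(match)
        else:
            n = len(matches)
            # Assumed rule: the reference position is the mean matched position.
            reference = (sum(b[0] for b in matches) / n,
                         sum(b[1] for b in matches) / n)
            return reference, matches
    return None
```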

4. The document image processing device according to claim 1, 
further comprising: 

an image memory part that holds the input document image per 
page; and 

a reference position designation part that causes a user to designate 
the reference position or the position of the reference pixel block in the 
document image, 

wherein the predetermined pixel block extraction part analyzes a 
layout of the document image of all the pages to be processed stored in the 
image memory part, and if there is approximately the same pixel block at a 
same position in the document image of each page, the predetermined pixel 
block extraction part regards this pixel block as the predetermined pixel 
block, and the image correction part corrects a location of the whole input 
document image so that a position of the predetermined pixel block 
extracted by the predetermined pixel block extraction part is coincident with 
the reference position or the position of the reference pixel block designated 
by the reference position designation part. 

5. The document image processing device according to claim 1, 
wherein the predetermined pixel block extraction part comprises a 
rectangular frame extraction part that extracts pixel block rectangular frames 
from the document image, a character string direction designation part that 
specifies a character string direction of the document image, a connected 
rectangular frame generation part that connects the rectangular frames in the 
direction designated by the character string direction designation part, and a 
connected rectangular frame extraction part that extracts the connected 
rectangular frame located nearest to the reference position or the position of 
the reference pixel block. 
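
The connection and selection steps of claim 5 might be sketched as follows. The sketch assumes rectangular frames given as (x0, y0, x1, y1), joins frames that follow one another along the character string direction with a gap no larger than `max_gap`, and measures nearness by the distance between frame center and reference position. The gap threshold, the distance measure, and both function names are hypothetical.

```python
def connect_frames(frames, direction="horizontal", max_gap=10):
    """Merge frames that are neighbours along the character string direction."""
    # a: index of the axis along the string, b: index of the other axis.
    a, b = (0, 1) if direction == "horizontal" else (1, 0)
    connected = []
    for f in sorted(frames, key=lambda r: r[a]):
        for i, c in enumerate(connected):
            gap = f[a] - c[a + 2]
            overlaps = f[b] < c[b + 2] and c[b] < f[b + 2]
            if gap <= max_gap and overlaps:
                # Grow the connected frame to enclose the new frame.
                connected[i] = (min(c[0], f[0]), min(c[1], f[1]),
                                max(c[2], f[2]), max(c[3], f[3]))
                break
        else:
            connected.append(tuple(f))
    return connected


def nearest_frame(frames, reference):
    """Return the frame whose center is closest to the reference position."""
    def dist2(r):
        cx, cy = (r[0] + r[2]) / 2, (r[1] + r[3]) / 2
        return (cx - reference[0]) ** 2 + (cy - reference[1]) ** 2
    return min(frames, key=dist2)
```

For a two-digit page number, the two digit frames are merged into one connected frame, which is then selected if the reference position lies near the page number.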

6. The document image processing device according to claim 5, 
wherein the character string direction designation part comprises a user 
interface that causes a user to designate the character string direction. 

7. The document image processing device according to claim 5, 
wherein the character string direction designation part comprises a document 
layout analysis part that specifies the character string direction by analyzing 
the layout of a document image. 

8. The document image processing device according to claim 7, 
wherein the document layout analysis part extracts runs of white pixels to be 
a background of the document image in both vertical and horizontal 
directions, connects adjacent runs of white pixels having a value equal to or 
larger than a predetermined threshold value to form a rectangular frame of a 
white pixel region in both vertical and horizontal directions, extracts 
rectangular frames having a width equal to or larger than a predetermined 
value from the rectangular frames in both vertical and horizontal directions, 
compares the number of rectangular frames extracted in the vertical 
direction with the number of rectangular frames extracted in the horizontal 
direction, and determines the direction of the larger number as the character 
string direction of the document. 
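
The comparison described in claim 8 can be illustrated with a simplified sketch. It assumes a binary image given as a list of rows with 0 for white and 1 for black, counts white frames along rows for the horizontal direction and along columns for the vertical direction, and resolves a tie in favour of horizontal. The tie rule, the parameter names `min_run` and `min_thickness`, and the function names are assumptions.

```python
def long_white_runs(line, min_run):
    """Return (start, end) of white runs (0 = white) at least min_run long."""
    runs, start = [], None
    for i, px in enumerate(list(line) + [1]):  # black sentinel ends the last run
        if px == 0 and start is None:
            start = i
        elif px != 0 and start is not None:
            if i - start >= min_run:
                runs.append((start, i))
            start = None
    return runs


def count_white_frames(image, min_run, min_thickness):
    """Count white frames built from overlapping long runs in adjacent lines."""
    frames = 0
    active = []  # (start, end, thickness) of frames that are still growing
    sentinel = [1] * len(image[0])  # black line that closes every open frame
    for line in list(image) + [sentinel]:
        grown, used = [], set()
        for s, e in long_white_runs(line, min_run):
            thickness = 1
            for idx, (a_start, a_end, a_thick) in enumerate(active):
                if a_start < e and s < a_end:  # run touches the frame above it
                    thickness = max(thickness, a_thick + 1)
                    used.add(idx)
            grown.append((s, e, thickness))
        # Frames that were not continued are finished; keep the thick ones.
        frames += sum(1 for idx, fr in enumerate(active)
                      if idx not in used and fr[2] >= min_thickness)
        active = grown
    return frames


def detect_string_direction(image, min_run, min_thickness):
    horizontal = count_white_frames(image, min_run, min_thickness)
    columns = [list(col) for col in zip(*image)]  # transpose to scan columns
    vertical = count_white_frames(columns, min_run, min_thickness)
    return "horizontal" if horizontal >= vertical else "vertical"
```

On a page of horizontally written text, the wide white gaps between text lines yield more horizontal frames than the narrow gaps between characters yield vertical ones, so the horizontal direction is chosen.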

9. The document image processing device according to claim 1, 
further comprising an undetected log generation part that records 
information of the document image from which the predetermined pixel 
block extraction part cannot extract the predetermined pixel block. 

10. The document image processing device according to claim 2, 
wherein the reference position designation part comprises an odd number 
page reference position designation part that designates the reference 
position or the position of the reference pixel block in odd number pages, an 
even number page reference position designation part that designates the 
reference position or the position of the reference pixel block in even 
number pages, and a page switching part that switches between outputs from 
the odd number page reference position designation part and the even 
number page reference position designation part depending on whether the 
page number is even or odd, thus making it possible to set respective 
separate extraction regions for the odd number page and the even number 
page. 
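
The page switching of claim 10 reduces to selecting one of two designated positions by page parity. A minimal sketch, assuming pages are numbered from 1 and positions are (x, y) pairs; the function name is hypothetical.

```python
def reference_for_page(page_number, odd_reference, even_reference):
    """Select the reference position designated for odd or even pages."""
    return odd_reference if page_number % 2 == 1 else even_reference
```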

11. The document image processing device according to claim 3, 
wherein, if approximately the same pixel block is found at a same position in 
the document image on odd number pages, the predetermined pixel block 
extraction part regards the pixel block as the predetermined pixel block on 
odd number pages, and if approximately the same pixel block is found at a 
same position in the document image on even number pages, regards the 
pixel block as the predetermined pixel block on even number pages. 

12. The document image processing device according to claim 1, 
further comprising a skew correction part that corrects skew of the input 
document image. 

13. The document image processing device according to claim 12, 
wherein the skew correction part subjects a center coordinate of a 
rectangular frame of pixel blocks to Hough transform to detect a skew angle. 
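
The skew detection of claim 13 might be sketched as a small Hough vote over the frame centers. The sketch assumes the skew lies within a few degrees of horizontal, searches candidate angles in fixed steps, and returns the angle whose accumulator has the strongest single peak. The search range, step, bin size, and the name `detect_skew_angle` are assumptions, and the sign of the result depends on the orientation of the y axis.

```python
import math
from collections import Counter


def detect_skew_angle(centers, max_angle=5.0, step=0.5, rho_bin=2.0):
    """Return the candidate angle (degrees) with the strongest Hough peak."""
    if not centers:
        return 0.0
    best_angle, best_votes = 0.0, 0
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        # rho is the signed distance from the origin to the line through
        # (x, y) that runs at the candidate angle; collinear centers share it.
        votes = Counter(
            round((y * math.cos(theta) - x * math.sin(theta)) / rho_bin)
            for x, y in centers
        )
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle
```

Voting with frame centers rather than with every black pixel keeps the accumulator small, since each character or word contributes a single point.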

14. The document image processing device according to claim 1, 
wherein the predetermined pixel block corresponds to a page number image, 
the document image processing device further comprising: 

a character recognition part that recognizes a character in an image; and 

a sort part that sorts the pages in the page number order after the 
image correction part corrects the location of the whole input document 
image and the character recognition part recognizes the page number 
character in the page number image. 

15. A document image processing method, comprising: 
causing a user to designate in advance a reference position or a 
position of a reference pixel block; 

extracting a predetermined pixel block commonly appearing at least 
in some pages from an input document image; and 

correcting a location of the whole input document image so that a 
position of the extracted predetermined pixel block is coincident with the 
reference position or the position of the reference pixel block. 

16. A document image processing method, comprising: 
analyzing a layout of an input document image in plural pages to be 
processed; 

if there is approximately the same pixel block at a similar position 
in the input document image in each page, determining the pixel block as a 
predetermined pixel block and determining a reference position; and 

correcting a location of the whole input document image so that a 
position of the predetermined pixel block appearing in the input document 
image in each page is coincident with the reference position. 

17. A document image processing method, comprising: 
causing a user to designate in advance a reference position; 
analyzing a layout of an input document image in plural pages to be 
processed; 

if there is approximately the same pixel block at a similar position 
in the input document image in each page, determining the pixel block as a 
predetermined pixel block; and 

correcting a location of the whole input document image so that a 
position of the predetermined pixel block appearing in the input document 
image in each page is coincident with the reference position. 

18. The document image processing method according to claim 15, 
wherein if the predetermined pixel block cannot be extracted from the input 
document image, information of the document image is recorded. 

19. A memory medium readable by a computer, the medium storing 
a program of instructions executable by the computer to perform a function 
comprising the steps of: 

receiving a reference position or a position of a reference pixel 
block designated in advance by a user; 

extracting a predetermined pixel block commonly appearing at least 
in some pages from an input document image; and 

correcting a location of the whole input document image so that a 
position of the extracted predetermined pixel block is coincident with the 
reference position or the position of the reference pixel block. 

20. A memory medium readable by a computer, the medium storing 
a program of instructions executable by the computer to perform a function 
comprising the steps of: 

analyzing a layout of an input document image in plural pages to be 
processed; 

if there is approximately the same pixel block at a similar position 
in the input document image in each page, determining the pixel block as a 
predetermined pixel block and determining a reference position; and 

correcting a location of the whole input document image so that a 
position of the predetermined pixel block appearing in the input document 
image in each page is coincident with the reference position. 

21. A memory medium readable by a computer, the medium storing 
a program of instructions executable by the computer to perform a function 
comprising the steps of: 

receiving a reference position designated in advance by a user; 
analyzing a layout of an input document image in plural pages to be 
processed; 

if there is approximately the same pixel block at a similar position 
in the input document image in each page, determining the pixel block as a 
predetermined pixel block; and 

correcting a location of the whole input document image so that a 
position of the predetermined pixel block appearing in the input document 
image in each page is coincident with the reference position. 

22. The memory medium according to claim 19, wherein, if the 
predetermined pixel block cannot be extracted from a document image, 
information of the document image is recorded. 